MGE Specialized Databases

Su Yanjing, Professor at the University of Science and Technology Beijing and chief expert for the National Key R&D Program

Bai Yang, Professor at the University of Science and Technology Beijing, supported by the National Ten-Thousand Talents Program for Young Top-Notch Talents

Qian Ping, Professor at the University of Science and Technology Beijing

Chief Members

Su Yanjing, Professor at the University of Science and Technology Beijing

Bai Yang, Professor at the University of Science and Technology Beijing

Qian Ping, Professor at the University of Science and Technology Beijing

Research Background

Database and big data technology is one of the three supporting platforms and key technologies of Materials Genome Engineering (MGE). It supports high-throughput computing and high-throughput experiments by meeting their data requirements, and it accumulates the data they produce so that data mining can be applied to the design of new materials. The integration of materials data with machine learning has become one of the fastest-growing and most promising areas of MGE. The team is committed to building an MGE database and to exploring Artificial Neural Network (ANN)-based computing methods and software that support high-throughput computation of interatomic potential functions. The team also carries out research on machine learning algorithms and software for materials, as well as on their application technologies.

Research Objectives

Focusing on the data requirements of MGE, the team is committed to building a dedicated database system that supports the development of MGE. This involves establishing the architecture and hardware needed for an MGE-dedicated database and populating the database with a substantial volume of data. The team will establish a library of interatomic potential functions with embedded machine learning algorithms and will develop potential function algorithms based on lattice inversion. To meet current needs in materials design, the team will develop machine learning algorithms and software for materials. The newly developed databases and data technologies will be applied to high-temperature alloys, high-entropy alloys, and ferroelectric materials. The objectives are to optimize and enhance material properties, to reveal structure-property relationships, and ultimately to improve the efficiency of materials R&D.

Main Research Areas

1. Technical R&D and construction of an MGE-dedicated database

2. Machine learning algorithms for interatomic potential functions and a library of potential functions

3. Machine learning algorithms for materials and their applications

Significant Research Progress

1. MGE-dedicated database system

The MGE-dedicated database supports the data requirements of high-throughput computing and experiments. When integrated with data mining technology, the database enables the discovery of new knowledge and rules that can be used in the R&D and design of new materials. This project aims to research and develop technology for high-throughput computing, together with technology for the automatic acquisition and processing of complex, heterogeneous experimental data. It also aims to build an MGE-dedicated database architecture that is integrated with data mining technology.

The team has developed materials database technology based on schema-free storage to meet the demand for flexible expansion and data mining of complex heterogeneous data. By adopting dynamic data containers, this technology is intended to overcome the obstacles to collecting multi-source heterogeneous materials data and to provide an integrated system framework in which both the database and the application software can scale. The team has also developed cloud-based high-throughput first-principles computing software for materials. The software automatically generates batch jobs, processes and analyzes the computing results, and collects the resulting data, giving China high-throughput first-principles computing software with proprietary intellectual property rights. The team has integrated 12 common data-mining algorithms into a networked materials data-mining and computing platform that incorporates the databases and supports automatic filling of feature variables and online sharing of machine learning models. In addition, the team has developed high-throughput technology and software for material X-ray diffraction and image processing, enabling the automatic acquisition, processing, and storage of typical experimental data on materials. By integrating these systems, the team has established an initial version of a database system dedicated to MGE (www.MGEdata.cn).
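To make the idea of schema-free storage concrete, the following is a minimal sketch in Python. It is not the team's implementation: the class name MaterialStore, its methods, the field names, and all values are hypothetical placeholders chosen for illustration. It only shows how records from different sources can carry different fields and still be stored and queried together.

```python
import json


class MaterialStore:
    """Hypothetical in-memory schema-free store: each record is a free-form dict."""

    def __init__(self):
        self._records = []

    def insert(self, record):
        # Round-trip through JSON to copy the record and to reject
        # values that could not be serialized for storage or exchange.
        self._records.append(json.loads(json.dumps(record)))
        return len(self._records) - 1

    def find(self, **criteria):
        """Return records whose top-level fields match every criterion."""
        return [
            r for r in self._records
            if all(r.get(key) == value for key, value in criteria.items())
        ]

    def fields(self):
        """Return the sorted union of top-level field names seen so far."""
        return sorted({key for r in self._records for key in r})


store = MaterialStore()

# A computed record and an experimental record with different fields.
# All values below are made-up placeholders, not real data.
store.insert({
    "source": "calculation",
    "formula": "AlCoCrCuFeNi",
    "method": "first-principles",
    "total_energy_eV": -1.0,
})
store.insert({
    "source": "experiment",
    "formula": "AlCoCrCuFeNi",
    "technique": "XRD",
    "pattern": {"two_theta_deg": [20.0, 20.1], "intensity": [5, 7]},
})

print(store.fields())
print(len(store.find(formula="AlCoCrCuFeNi")))
```

Because no table schema is fixed in advance, a new kind of measurement only adds new field names. The trade-off is that consistency checks must be done by the application rather than by the storage layer.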


2. Machine learning-based design of high-performance high-entropy alloys

The team studied six-element Al-Co-Cr-Cu-Fe-Ni high-entropy alloys and proposed an optimization design technique for high-entropy alloys that combines machine learning, experimental design, and feedback. The technique uses global optimization methods to screen alloys in a complex composition space and uses iterative experimental feedback to strengthen the model, which accelerates the design of compositions for high-hardness, six-element high-entropy alloys. First, machine learning algorithms establish a mapping from the chemical composition and the material property descriptors to the target attribute (hardness). Next, based on the machine learning predictions and the associated uncertainty analysis, utility functions are used to select new candidate compositions for experimental verification. The test data from the synthesized samples are then added to the training data set, and the procedure is repeated. Through such feedback loops, rapid optimization of the target attribute was achieved. In the six-element high-entropy alloy system, 42 new alloys were prepared, of which 35 had a hardness higher than the highest value in the original training set, and 17 exceeded that value by more than 10%. The work also found that using domain knowledge together with chemical composition as descriptors leads to more rapid discovery of materials with better properties. The machine learning approach can also be extended to the design of other multi-component alloys, such as bulk amorphous alloys and high-temperature alloys.
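The feedback loop described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the team's method: the surrogate model (a bootstrap ensemble of nearest-neighbor regressors), the utility function (expected improvement, one common choice), the random candidate sampling used in place of a global optimizer, and the synthetic measure_hardness function standing in for a real experiment are all assumptions made for the example.

```python
import math
import random


def measure_hardness(x):
    """Hypothetical stand-in for synthesis and testing: a synthetic response."""
    peak = (0.30, 0.10, 0.20, 0.05, 0.20, 0.15)  # arbitrary, not a real alloy
    return 800.0 - 2000.0 * sum((a - b) ** 2 for a, b in zip(x, peak))


def random_composition(rng, n=6):
    """Draw n non-negative element fractions that sum to 1."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return tuple(w / total for w in weights)


def knn_predict(train, x, k=3):
    """Average the hardness of the k training compositions nearest to x."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def predict_with_uncertainty(train, x, rng, n_boot=20):
    """Bootstrap ensemble: mean and standard deviation of the predictions."""
    preds = []
    for _ in range(n_boot):
        resampled = [rng.choice(train) for _ in train]
        preds.append(knn_predict(resampled, x))
    mean = sum(preds) / n_boot
    var = sum((p - mean) ** 2 for p in preds) / n_boot
    return mean, math.sqrt(var)


def expected_improvement(mean, sigma, best):
    """Utility of testing a candidate, given the best hardness found so far."""
    if sigma <= 1e-12:
        return max(mean - best, 0.0)
    z = (mean - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + sigma * pdf


def design_loop(n_init=10, n_iter=8, n_candidates=200, seed=0):
    """Run the predict, select, measure, retrain loop; return data and history."""
    rng = random.Random(seed)
    train = []
    for _ in range(n_init):
        x = random_composition(rng)
        train.append((x, measure_hardness(x)))
    history = [max(y for _, y in train)]
    for _ in range(n_iter):
        best = history[-1]
        scored = []
        for _ in range(n_candidates):
            x = random_composition(rng)
            mean, sigma = predict_with_uncertainty(train, x, rng)
            scored.append((expected_improvement(mean, sigma, best), x))
        _, x_next = max(scored)                         # most useful candidate
        train.append((x_next, measure_hardness(x_next)))  # "experiment"
        history.append(max(best, train[-1][1]))
    return train, history


train, history = design_loop()
print(history)
```

The history list records the best hardness known after each iteration, so it can only stay flat or rise. In a real campaign, measure_hardness would be replaced by synthesis and hardness testing, and the composition vector would typically be extended with descriptors derived from domain knowledge.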


Publications
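Copyright © University of Science and Technology Beijing. Site construction and technical support: Office of Information Construction and Management. Beijing Public Network Security Filing No. 110402430062. Beijing ICP Filing No. 13030111.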